134 research outputs found

    Dialogue between social movement activists and a Master's Program in youth and adult education

    This article presents the results of a qualitative study on the relationship between social movements and a Master's Program in youth and adult education in Bahia, Brazil. It pays particular attention to the importance of antiracism in the program's curriculum and to the curriculum's decolonization.

    Run Generation Revisited: What Goes Up May or May Not Come Down

    In this paper, we revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M, and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs (or equivalently, maximizing the average run length) when the runs are allowed to be sorted or reverse sorted. We study the problem in the online setting, both with and without resource augmentation, and in the offline setting. (1) We analyze alternating-up-down replacement selection (runs alternate between sorted and reverse sorted), which was studied by Knuth as far back as 1963. We show that this simple policy is asymptotically optimal. Specifically, we show that alternating-up-down replacement selection is 2-competitive and no deterministic online algorithm can perform better. (2) We give online algorithms having smaller competitive ratios with resource augmentation. Specifically, we exhibit a deterministic algorithm that, when given a buffer of size 4M, is able to match or beat any optimal algorithm having a buffer of size M. Furthermore, we present a randomized online algorithm which is 7/4-competitive when given a buffer twice that of the optimal. (3) We demonstrate that performance can also be improved with a small amount of foresight. We give an algorithm, which is 3/2-competitive, with foreknowledge of the next 3M elements of the input stream. For the extreme case where all future elements are known, we design a PTAS for computing the optimal strategy a run generation algorithm must follow. (4) Finally, we present algorithms tailored for nearly sorted inputs which are guaranteed to have optimal solutions with sufficiently long runs.
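
As a rough illustration of the alternating-up-down policy described in this abstract, the following minimal sketch keeps a buffer of at most M elements, emits an ascending run until no buffered element can extend it, then switches to a descending run, and so on. The function name `alternating_up_down_runs`, the use of a sorted list as the buffer, and the choice to start each run from the extreme element of the buffer are assumptions made for this example; they are not taken from the paper.

```python
import bisect
from itertools import islice
from typing import Iterable, List

_END = object()  # sentinel marking an exhausted input stream


def alternating_up_down_runs(stream: Iterable[int], M: int) -> List[List[int]]:
    """Split `stream` into runs that alternate ascending/descending,
    using a buffer that never holds more than M elements (illustrative)."""
    if M < 1:
        raise ValueError("buffer size M must be at least 1")
    it = iter(stream)
    buf = sorted(islice(it, M))  # buffer kept sorted for simplicity
    runs: List[List[int]] = []
    ascending = True
    while buf:
        run: List[int] = []
        while buf:
            if not run:
                # Start the run from the extreme element of the buffer.
                idx = 0 if ascending else len(buf) - 1
            elif ascending:
                # Smallest buffered element that is >= the last output.
                idx = bisect.bisect_left(buf, run[-1])
                if idx == len(buf):
                    break  # nothing can extend the ascending run
            else:
                # Largest buffered element that is <= the last output.
                idx = bisect.bisect_right(buf, run[-1]) - 1
                if idx < 0:
                    break  # nothing can extend the descending run
            run.append(buf.pop(idx))
            nxt = next(it, _END)  # refill the freed buffer slot
            if nxt is not _END:
                bisect.insort(buf, nxt)
        runs.append(run)
        ascending = not ascending  # alternate direction for the next run
    return runs
```

A sorted list makes each step linear in M; a pair of heaps would be the more usual choice when M is large, at the cost of a longer example.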

    Linear regression for numeric symbolic variables: an ordinary least squares approach based on Wasserstein Distance

    In this paper we present a linear regression model for modal symbolic data. The observed variables are histogram variables according to the definition given in the framework of Symbolic Data Analysis, and the parameters of the model are estimated using the classic Least Squares method. An appropriate metric is introduced in order to measure the error between the observed and the predicted distributions. In particular, the Wasserstein distance is proposed. Some properties of this metric are exploited to predict the response variable as a direct linear combination of other independent histogram variables. Measures of goodness of fit are discussed. An application on real data corroborates the proposed method.
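
To make the error measure concrete, here is a minimal sketch of a squared Wasserstein distance between two histograms. It assumes the order-2 distance for one-dimensional distributions, computed as the integral over t in (0, 1) of the squared difference between the two quantile functions, and it assumes mass is spread uniformly inside each bin with strictly positive bin weights. The abstract does not state these details, so the function names and the midpoint-rule integration are choices made for this example only.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence


def histogram_quantile(edges: Sequence[float], weights: Sequence[float], t: float) -> float:
    """Quantile function of a histogram at level t in (0, 1).

    Assumes len(edges) == len(weights) + 1, strictly positive weights,
    and uniform mass within each bin.
    """
    total = sum(weights)
    cum = [0.0] + [c / total for c in accumulate(weights)]
    # Index of the bin whose cumulative range contains t.
    i = min(bisect_right(cum, t) - 1, len(weights) - 1)
    frac = (t - cum[i]) / (cum[i + 1] - cum[i])
    return edges[i] + frac * (edges[i + 1] - edges[i])


def squared_wasserstein(edges_a, weights_a, edges_b, weights_b, n: int = 1000) -> float:
    """Midpoint-rule approximation of the squared order-2 Wasserstein distance."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        diff = (histogram_quantile(edges_a, weights_a, t)
                - histogram_quantile(edges_b, weights_b, t))
        total += diff * diff
    return total / n
```

For instance, shifting a one-bin histogram on [0, 1] to [1, 2] moves every quantile by exactly 1, so the squared distance is 1.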

    Assortment optimisation under a general discrete choice model: A tight analysis of revenue-ordered assortments

    The assortment problem in revenue management is the problem of deciding which subset of products to offer to consumers in order to maximise revenue. A simple and natural strategy is to select the best assortment out of all those that are constructed by fixing a threshold revenue π and then choosing all products with revenue at least π. This is known as the revenue-ordered assortments strategy. In this paper we study the approximation guarantees provided by revenue-ordered assortments when customers are rational in the following sense: the probability of selecting a specific product from the set being offered cannot increase if the set is enlarged. This rationality assumption, known as regularity, is satisfied by almost all discrete choice models considered in the revenue management and choice theory literature, and in particular by random utility models. The bounds we obtain are tight and improve on recent results in that direction, such as for the Mixed Multinomial Logit model by Rusmevichientong et al. (2014). An appealing feature of our analysis is its simplicity, as it relies only on the regularity condition. We also draw a connection between assortment optimisation and two pricing problems called unit demand envy-free pricing and Stackelberg minimum spanning tree: these problems can be restated as assortment problems under discrete choice models satisfying the regularity condition, and moreover revenue-ordered assortments then correspond to the well-studied uniform pricing heuristic. When specialised to that setting, the general bounds we establish for revenue-ordered assortments match and unify the best known results on uniform pricing.
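
The revenue-ordered strategy described in this abstract can be sketched in a few lines. The example below assumes a plain multinomial logit (MNL) choice model purely for illustration: product i has revenue `revenues[i]` and preference weight `weights[i]`, and the no-purchase option has weight `v0`. The paper itself treats general regular choice models, so the MNL revenue formula and all names here are assumptions of this sketch.

```python
from typing import Sequence, Tuple


def expected_revenue(assortment: Sequence[int], revenues: Sequence[float],
                     weights: Sequence[float], v0: float = 1.0) -> float:
    """Expected revenue per customer under an (assumed) MNL model."""
    denom = v0 + sum(weights[i] for i in assortment)
    return sum(revenues[i] * weights[i] for i in assortment) / denom


def best_revenue_ordered(revenues: Sequence[float], weights: Sequence[float],
                         v0: float = 1.0) -> Tuple[Tuple[int, ...], float]:
    """Try every threshold pi and keep the best assortment
    {i : revenues[i] >= pi}."""
    best_set: Tuple[int, ...] = ()
    best_val = 0.0  # offering nothing earns nothing
    for pi in sorted(set(revenues), reverse=True):
        candidate = tuple(i for i, r in enumerate(revenues) if r >= pi)
        val = expected_revenue(candidate, revenues, weights, v0)
        if val > best_val:
            best_set, best_val = candidate, val
    return best_set, best_val
```

Only as many assortments as there are distinct revenue values are evaluated, instead of all 2^n subsets, which is what makes the heuristic attractive.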

    Change in BMI Accurately Predicted by Social Exposure to Acquaintances

    Research has mostly focused on obesity and not on processes of BMI change more generally, although these may be key factors that lead to obesity. Studies have suggested that obesity is affected by social ties. However, these studies used survey-based data collection techniques that may be biased toward selecting only close friends and relatives. In this study, mobile phone sensing techniques were used to routinely capture social interaction data in an undergraduate dorm. By automating the capture of social interaction data, the limitations of self-reported social exposure data are avoided. This study attempts to understand and develop a model that best describes the change in BMI using social interaction data. We evaluated a cohort of 42 college students in a co-located university dorm, using social interaction data automatically captured via mobile phones together with survey-based health-related information. We determined the most predictive variables for change in BMI using the least absolute shrinkage and selection operator (LASSO) method. The selected variables, with gender, healthy diet category, and ability to manage stress, were used to build multiple linear regression models that estimate the effect of exposure and individual factors on change in BMI. We identified the best model using the Akaike Information Criterion (AIC) and R². This study found a model that explains 68% (p<0.0001) of the variation in change in BMI. The model combined social interaction data, especially from acquaintances, and personal health-related information to explain change in BMI. This is the first study to take into account both different levels of social interaction and personal health-related information. Social interactions with acquaintances accounted for more than half the variation in change in BMI. This suggests the importance not only of individual health information but also of social interactions with the people we are exposed to, even people we may not consider close friends. MIT Masdar Program. MIT Media Lab Consortiu

    The time-profile of cell growth in fission yeast: model selection criteria favoring bilinear models over exponential ones

    BACKGROUND: There is considerable controversy concerning the exact growth profile of size parameters during the cell cycle. Linear, exponential and bilinear models are commonly considered, and the same model may not apply for all species. Selection of the most adequate model to describe a given data-set requires the use of quantitative model selection criteria, such as the partial (sequential) F-test, the Akaike information criterion and the Schwarz Bayesian information criterion, which are suitable for comparing differently parameterized models in terms of the quality and robustness of the fit but have not yet been used in cell growth-profile studies. RESULTS: Length increase data from representative individual fission yeast (Schizosaccharomyces pombe) cells measured on time-lapse films have been reanalyzed using these model selection criteria. To fit the data, an extended version of a recently introduced linearized biexponential (LinBiExp) model was developed, which makes possible a smooth, continuously differentiable transition between two linear segments and, hence, allows fully parametrized bilinear fittings. Despite relatively small differences, essentially all the quantitative selection criteria considered here indicated that the bilinear model was somewhat more adequate than the exponential model for fitting these fission yeast data. CONCLUSION: A general quantitative framework was introduced to judge the adequacy of bilinear versus exponential models in the description of growth time-profiles. For single cell growth, because of the relatively limited data-range, the statistical evidence is not strong enough to favor one model clearly over the other and to settle the bilinear versus exponential dispute. Nevertheless, for the present individual cell growth data for fission yeast, the bilinear model seems more adequate according to all metrics, especially in the case of wee1Δ cells
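
For readers unfamiliar with the selection criteria named in this abstract, the sketch below computes AIC and the Schwarz BIC for least-squares fits with Gaussian errors, using the common forms n·ln(RSS/n) + 2k and n·ln(RSS/n) + k·ln(n), with additive constants dropped. The sample size, residual sums of squares, and parameter counts in the usage lines are hypothetical numbers chosen for illustration; they are not values from the study, and conventions differ on whether the error variance counts toward k.

```python
import math


def aic_least_squares(n: int, rss: float, k: int) -> float:
    """AIC for a least-squares fit (Gaussian errors, constants dropped)."""
    return n * math.log(rss / n) + 2 * k


def bic_least_squares(n: int, rss: float, k: int) -> float:
    """Schwarz BIC for a least-squares fit (Gaussian errors, constants dropped)."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical comparison: model A has fewer parameters but a worse fit,
# model B has more parameters and a smaller residual sum of squares.
n_points = 40
aic_a = aic_least_squares(n_points, rss=4.0, k=2)
aic_b = aic_least_squares(n_points, rss=3.0, k=4)
bic_a = bic_least_squares(n_points, rss=4.0, k=2)
bic_b = bic_least_squares(n_points, rss=3.0, k=4)
preferred_by_aic = "B" if aic_b < aic_a else "A"
preferred_by_bic = "B" if bic_b < bic_a else "A"
```

Lower values are better for both criteria; BIC penalizes each extra parameter by ln(n) rather than 2, so it favors the smaller model more strongly once n exceeds about 7.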

    Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

    BACKGROUND: Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at the genomic level. Although many machine learning methods have been developed and applied to the area of microarray gene expression data analysis, the majority of them are based on linear models, which are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods usually also bring in false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large. This leads to problems of numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have a couple of critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is always preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potential to achieve this goal. RESULTS: A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences.
Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound not only in the case with a linear Bayesian classifier but also in the case with a very non-linear Bayesian classifier. This sheds light on its broader usability for microarray data analysis problems, especially those for which linear methods work poorly. The KIGP was also applied to four published microarray datasets, and the results showed that the KIGP performed better than or at least as well as any of the referenced state-of-the-art methods in all of these cases. CONCLUSION: Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates the model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently.

    Lazy Lasso for local regression

    Locally weighted regression is a technique that predicts the response for new data items from their neighbors in the training data set, where closer data items are assigned higher weights in the prediction. However, the original method may suffer from overfitting and fail to select the relevant variables. In this paper we propose combining a regularization approach with locally weighted regression to achieve sparse models. Specifically, the lasso is a shrinkage and selection method for linear regression. We present an algorithm that embeds lasso in an iterative procedure that alternately computes weights and performs lasso-wise regression. The algorithm is tested on three synthetic scenarios and two real data sets. Results show that the proposed method outperforms linear and local models for several kinds of scenarios.

    Training compliance control yields improvements in drawing as a function of Beery scores

    Many children have difficulty producing movements well enough to improve in sensori-motor learning. Previously, we developed a training method that supports active movement generation to allow improvement at a 3D tracing task requiring good compliance control. Here, we tested 7–8 year old children from several 2nd grade classrooms to determine whether 3D tracing performance could be predicted using the Beery VMI. We also examined whether 3D tracing training led to improvements in drawing. Baseline testing included Beery, a drawing task on a tablet computer, and 3D tracing. We found that baseline performance in 3D tracing and drawing co-varied with the visual perception (VP) component of the Beery. Differences in 3D tracing between children scoring low versus high on the Beery VP replicated differences previously found between children with and without motor impairments, as did post-training performance that eliminated these differences. Drawing improved as a result of training in the 3D tracing task. The training method improved drawing and reduced differences predicted by Beery scores.

    An Information Theory Approach to Hypothesis Testing in Criminological Research

    Background: This research demonstrates how the Akaike information criterion (AIC) can be an alternative to null hypothesis significance testing in selecting best fitting models. It presents an example to illustrate how AIC can be used in this way. Methods: Using data from Milwaukee, Wisconsin, we test models of place-based predictor variables on street robbery and commercial robbery. We build models to balance explanatory power and parsimony. Measures include the presence of different kinds of businesses, together with selected age groups and social disadvantage. Results: Models including place-based measures of land use emerged as the best models among the set of tested models. These were superior to models that included measures of age and socioeconomic status. The best models for commercial and street robbery include three measures of ordinary businesses, liquor stores, and spatial lag. Conclusions: Models based on information theory offer a useful alternative to significance testing when a strong theoretical framework guides the selection of model sets. Theoretically relevant ‘ordinary businesses’ have a greater influence on robbery than socioeconomic variables and most measures of discretionary businesses